I congratulate Drs. Anderson-Cook and Lu on their insightful and comprehensive article on the significance of Designed Data Collection (DDC) in the analytics process. Their work highlights the importance of study design in effectively analyzing data, and I wholeheartedly agree with their perspective. A well-designed study is easy to analyze, whereas a poorly designed study may never yield valuable insights. As a statistician, I am committed to ensuring that a qualified statistician is involved in projects from the beginning and throughout the data gathering and analysis process.

Drs. Anderson-Cook and Lu's article is motivated by their belief that many people think the use of big data will replace the need for DDC and that the areas of big data and DDC need to be better aligned. Many uses of big data have ignored the importance of a well-designed study, which includes DDC. However, some influential players within the field of data science recognize that large quantities of data alone are insufficient for gaining insights or making good decisions. In his article "Statistical Modeling: The Two Cultures,"1 Leo Breiman explores the emergence of two approaches to data modeling: statistical and algorithmic. He notes how the algorithmic approach has grown independently of input from the statistics community and has been used to solve complex data challenges that traditional statistical models could not handle. This has led many statisticians to question their relevance in the field. Conversely, the field of data science has often overlooked the decades of scientific research and methodology developed within the field of statistics, particularly regarding study design and data collection. However, both cultures can benefit from a stronger emphasis on DDC methods. By focusing on the importance of study design, both statisticians and data scientists can work together to ensure that the data used for analysis are of high quality and can lead to sound conclusions.

Recent years have brought a growing awareness of the potential dangers of black box algorithms and the need for transparency and regulation in data science. In 2016, Cathy O'Neil's book "Weapons of Math Destruction"2 highlighted that many algorithms in use today are opaque, unregulated, and can reinforce discrimination. This has led to calls for more transparency and accountability in the data sciences. Similarly, using classical statistical methods to draw conclusions from data can pose dangers, such as the problem of "p-hacking," where researchers report only the statistical models that best fit the data, leading to inaccurate conclusions. To address this problem, Gelman and others have recommended preregistration, in which researchers specify their data collection and analysis protocols before gathering data, along with replication of results when possible.3

Another important aspect of data science is reproducibility. Roger Peng4 has discussed the importance of reproducibility in the computational sciences, which applies to both statistical and algorithmic modeling. When replication is not easy or possible, researchers should develop their study design, conduct their analysis, and communicate their results with an aim for full reproducibility. Full reproducibility means that others should be able to rerun the same model using the same data and obtain the same results. A minimum standard for full reproducibility in computational work includes making fully documented and executable code and data available to others, with appropriate privacy protections for the data and scenario at hand. By requiring researchers and practitioners to make their code and data transparent, we can ensure better choices in the design and implementation of analytical work and enhance the trustworthiness of the results.
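As a concrete illustration, the sketch below shows one minimal form a reproducible analysis script might take. The file names, data set, and model are hypothetical; the point is that fixing the sources of randomness, reading archived data, and recording the software environment alongside the results are what allow another analyst to rerun the same model on the same data and obtain the same numbers.

import json
import platform
import random

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

SEED = 2023  # fix every source of randomness so reruns give identical results
random.seed(SEED)
np.random.seed(SEED)

# Read the archived data file that is shared alongside this script
# (hypothetical file and variable names).
data = pd.read_csv("data/study_data.csv")
X = data.drop(columns=["outcome"])
y = data["outcome"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=SEED
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Save the results together with the software environment so that any
# discrepancy on a rerun can be traced to a changed dependency.
results = {
    "test_accuracy": float(model.score(X_test, y_test)),
    "python_version": platform.python_version(),
    "numpy_version": np.__version__,
    "pandas_version": pd.__version__,
}
with open("results/summary.json", "w") as f:
    json.dump(results, f, indent=2)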
Concerns about the scientific validity and reproducibility of analytical results are not limited to academic data scientists and statisticians; they are also recognized in industry. In response to these concerns, Google has created a new discipline called "Decision Intelligence Engineering," which aims to help data scientists make sound decisions using data and to avoid bias in the data used to train models. This field is heavily influenced by research design principles from statistics and the social sciences. Google has appointed Cassie Kozyrkov as its first-ever Chief Decision Scientist, tasked with training many of Google's employees in this field.5 The inclusion of Decision Intelligence in the 2022 Gartner Hype Cycle for Artificial Intelligence6 is evidence of the growing acceptance of Google's approach to blending scientific validity with the fields of AI and machine learning. Additionally, the 2021 Gartner Hype Cycle for Artificial Intelligence estimates that by 2025, 70% of organizations will shift their focus from big data to small and wide data.7 The report notes that leaders are turning to "small data" analytics techniques with the goal of using available data more effectively. However, for small data to provide insights, good sampling and study designs must be used to ensure that the data are representative and suitable for answering the questions or solving the problems at hand.

As an academic statistician who works with many industrial partners in data collection and analysis, I am particularly encouraged by the organization of Drs. Anderson-Cook and Lu's presentation. They specifically describe the applicability of DDC before data collection, during the collection of big data, for real-time processing, and for the confirmation and improvement of results of big data analyses. Calling attention to all stages of a study encourages a life cycle approach to incorporating study design and DDC throughout the process.

Interestingly, this structure maps closely onto recent recommendations for a four-phase approach to process monitoring. The four-phase framework expands the traditional Phase I (retrospective analysis) and Phase II (prospective monitoring) framework. Especially when working with streaming data in Industry 4.0 applications, several authors have recommended adding a Phase 0, in which the groundwork is laid for effective monitoring, including determining which characteristics to measure, establishing a sampling plan, addressing measurement systems analysis, and collecting an initial representative sample for Phase I.8-12 Adding to Phases 0, I, and II, Zwetsloot et al.12 have recommended including model maintenance as Phase III. In Phase III, one continuously evaluates whether the model determined in Phase I is still appropriate and addresses issues related to model updating.
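To make the phased structure concrete, the sketch below separates Phase I estimation of control limits from Phase II monitoring and adds a simple Phase III model-maintenance check. The simulated data, the Shewhart-type individuals chart, and the drift rule are chosen purely for illustration under my own assumptions and are not drawn from the cited references.

import numpy as np

def phase1_limits(baseline):
    # Phase I: estimate the center line and 3-sigma limits from a
    # representative baseline sample collected under the Phase 0 plan.
    center = baseline.mean()
    sigma = baseline.std(ddof=1)
    return center, sigma, center - 3 * sigma, center + 3 * sigma

def phase2_signal(x, lcl, ucl):
    # Phase II: flag an individual observation outside the control limits.
    return x < lcl or x > ucl

def phase3_needs_refit(recent, center, sigma, z_crit=3.0):
    # Phase III: a crude model-maintenance check -- has the mean of the
    # recent data drifted enough that the Phase I limits should be revisited?
    z = abs(recent.mean() - center) / (sigma / np.sqrt(len(recent)))
    return z > z_crit

rng = np.random.default_rng(1)
baseline = rng.normal(10.0, 1.0, size=200)   # Phase I (retrospective) data
center, sigma, lcl, ucl = phase1_limits(baseline)

stream = rng.normal(10.4, 1.0, size=50)      # incoming Phase II observations
alarms = [i for i, x in enumerate(stream) if phase2_signal(x, lcl, ucl)]
print("Phase II alarms at indices:", alarms)
print("Phase III re-estimation suggested:", phase3_needs_refit(stream, center, sigma))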
A challenge to implementing DDC in practice is that many practitioners who recognize the need for sound study design and DDC lack the skills or training to carry out these tasks. Academics must update existing courses on sampling and design of experiments (DoE) to meet the challenges of modern data, big and small. Perhaps we can include in our curricula courses titled Designed Data Collection for the Data Sciences. Another challenge to implementing DDC in practice is that many of the existing sampling and experimental design methodologies do not directly apply to the complex problems we face with big data. This lack of fit between existing methodologies and real-world data problems provides opportunities for collaboration and research, as new and improved methods for sampling, sequential data collection, and experimental design can be developed and disseminated to give practitioners the tools they need.

L. Allison Jones-Farmer is the Van Andel Professor of Business Analytics at Miami University in Oxford, Ohio. She received a B.S. in Mathematics from Birmingham-Southern College, an M.S. in Applied Statistics from the University of Alabama, and a Ph.D. in Applied Statistics from the University of Alabama.

No data or analyses were used in developing this discussion article.